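A problem with convolutional neural networks (CNNs) is that they require large datasets to become sufficiently robust; on small datasets they are prone to overfitting. Many methods have been proposed to overcome this shortcoming. When additional samples cannot easily be collected, a common approach is to generate more data points from the existing data with augmentation techniques. In image classification, many augmentation methods rely on simple image manipulation algorithms. In this work, we build ensembles at the data level by adding images generated by combining 14 augmentation methods, three of which are proposed here for the first time. These novel methods are based on the Fourier transform (FT), the Radon transform (RT), and the discrete cosine transform (DCT). Pretrained ResNet50 networks are fine-tuned on training sets that include the images derived from each augmentation method. These networks and several fusions are evaluated and compared on 11 benchmarks. The results show that building ensembles at the data level by combining different data augmentation methods yields classifiers that not only compete with the state of the art but often surpass the best methods reported in the literature.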
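To make the idea of a transform-domain augmentation concrete, here is a minimal sketch of one possible DCT-based scheme. The abstract does not say how the DCT method works, so everything below is an assumption for illustration: the function names (`dct`, `idct`, `dct_augment_row`), the row-wise 1D transform, and the choice to rescale the non-DC coefficients by small random factors.

```python
import math
import random


def dct(x):
    """Orthonormal DCT-II of a 1D sequence of floats."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def idct(c):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        for k in range(1, n):
            s += c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(s)
    return out


def dct_augment_row(row, noise_scale=0.05, rng=None):
    """Hypothetical augmentation: jitter every non-DC coefficient of one row.

    The DC coefficient (index 0) is left untouched, so the mean intensity
    of the row is preserved.
    """
    if rng is None:
        rng = random.Random(0)
    coeffs = dct(row)
    jittered = [coeffs[0]] + [
        c * (1.0 + rng.uniform(-noise_scale, noise_scale)) for c in coeffs[1:]
    ]
    return idct(jittered)


def dct_augment_image(image, noise_scale=0.05, seed=0):
    """Apply the row-wise augmentation to a grayscale image (list of rows)."""
    rng = random.Random(seed)
    return [dct_augment_row(row, noise_scale, rng) for row in image]
```

A real pipeline would more likely use a 2D DCT from a numerical library; the pure-Python version is only meant to show the transform, perturb, and invert pattern.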
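Semantic segmentation consists of classifying each pixel of an image by assigning it a label chosen from the set of all available labels. Over the last few years, a lot of attention has shifted to this kind of task. Many computer vision researchers have tried to apply autoencoder structures to develop models that can learn the semantics of an image as well as a low-level representation of it. In an autoencoder architecture, given an input, an encoder computes a low-dimensional representation of the input, which a decoder then uses to reconstruct the original data. In this work, we propose an ensemble of convolutional neural networks (CNNs). In ensemble methods, many different models are trained and then used for classification, and the ensemble aggregates the outputs of the single classifiers. The approach leverages the differences among the classifiers to improve the performance of the whole system. Diversity among the single classifiers is enforced by using different loss functions. In particular, we present a new loss function that results from the combination of the Dice loss and the structural similarity index. The proposed ensemble is implemented by combining different backbone networks using the DeepLabV3+ and HarDNet environments. The proposal is assessed through an extensive empirical evaluation on two real-life scenarios: polyp and skin segmentation. All the code is available online at https://github.com/lorisnanni.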
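As a rough illustration of how a Dice term and a structural similarity term can be combined into one loss, the sketch below works on flattened masks with values in [0, 1]. The abstract does not give the exact formula, so the additive combination, the `ssim_weight` parameter, and the use of a single global SSIM window (instead of the usual sliding window) are all assumptions.

```python
def dice_loss(pred, target, eps=1e-7):
    """1 - Dice coefficient for two flattened masks with values in [0, 1]."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return 1.0 - (2.0 * intersection + eps) / (total + eps)


def global_ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """SSIM computed over one window covering the whole mask.

    c1 and c2 are the usual stabilising constants for a dynamic range of 1.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) * (a - mean_x) for a in x) / n
    var_y = sum((b - mean_y) * (b - mean_y) for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    numerator = (2.0 * mean_x * mean_y + c1) * (2.0 * cov + c2)
    denominator = (mean_x * mean_x + mean_y * mean_y + c1) * (var_x + var_y + c2)
    return numerator / denominator


def dice_ssim_loss(pred, target, ssim_weight=1.0):
    """Hypothetical combined loss: Dice loss plus a weighted (1 - SSIM) term."""
    return dice_loss(pred, target) + ssim_weight * (1.0 - global_ssim(pred, target))
```

The loss is close to zero for a perfect prediction and grows as the predicted mask diverges from the target in overlap or in structure.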
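Multilabel learning tackles the problem of associating a sample with multiple class labels. This work proposes a new ensemble method for multilabel classification: the core of the proposed approach combines a set of gated recurrent units and temporal convolutional neural networks trained with variants of the Adam optimization method. Multiple Adam variants, including a novel one proposed here, are compared and tested; these variants are based on the difference between present and past gradients, with the step size adjusted for each parameter. The proposed neural network approach is also combined with Incorporating Multiple Clustering Centers (IMCC), which further boosts classification performance. Multiple experiments on nine datasets representing a wide variety of multilabel tasks demonstrate the robustness of our best ensemble, which is shown to outperform the state of the art. The MATLAB code for generating the best ensembles in the experimental section will be available at https://github.com/lorisnanni.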
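The following sketch shows what an Adam variant driven by the difference between present and past gradients can look like. It follows the publicly known diffGrad-style rule, in which each parameter's step is scaled by a sigmoid of the absolute gradient change. Whether the variants in this work use exactly this rule is an assumption, and the names `init_state`, `diffgrad_step`, and `minimize_quadratic` are hypothetical.

```python
import math


def init_state(n):
    """Optimizer state for n scalar parameters."""
    return {"t": 0, "m": [0.0] * n, "v": [0.0] * n, "prev_grad": [0.0] * n}


def diffgrad_step(params, grads, state, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One diffGrad-style update; returns the new parameter list."""
    state["t"] += 1
    t = state["t"]
    new_params = []
    for i, (p, g) in enumerate(zip(params, grads)):
        # Standard Adam moment estimates with bias correction.
        state["m"][i] = beta1 * state["m"][i] + (1.0 - beta1) * g
        state["v"][i] = beta2 * state["v"][i] + (1.0 - beta2) * g * g
        m_hat = state["m"][i] / (1.0 - beta1 ** t)
        v_hat = state["v"][i] / (1.0 - beta2 ** t)
        # Per-parameter "friction" in [0.5, 1): small when the gradient
        # barely changed since the previous step, close to 1 when it changed a lot.
        friction = 1.0 / (1.0 + math.exp(-abs(state["prev_grad"][i] - g)))
        state["prev_grad"][i] = g
        new_params.append(p - lr * friction * m_hat / (math.sqrt(v_hat) + eps))
    return new_params


def minimize_quadratic(steps=500, lr=0.1, start=3.0):
    """Toy usage: minimise f(x) = x^2, whose gradient is 2x."""
    params = [start]
    state = init_state(1)
    for _ in range(steps):
        grads = [2.0 * params[0]]
        params = diffgrad_step(params, grads, state, lr=lr)
    return params[0]
```

The only change relative to plain Adam is the `friction` factor, which damps the step for parameters whose gradient is no longer changing.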
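Audio data augmentation is a key step in training deep neural networks for audio classification tasks. In this paper, we introduce Audiogmenter, a novel audio data augmentation library in MATLAB. We provide 15 different augmentation algorithms for raw audio data and 8 for spectrograms. We efficiently implemented several augmentation techniques whose usefulness has been extensively demonstrated in the literature. To the best of our knowledge, this is the largest freely available MATLAB audio data augmentation library. We validated the efficiency of our algorithms by evaluating them on the ESC-50 dataset. The toolbox and its documentation can be downloaded at https://github.com/lorisnanni/audiogmenter.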
With more and more data being collected, data-driven modeling methods have been gaining in popularity in recent years. While physically sound, classical gray-box models are often cumbersome to identify and scale, and their accuracy might be hindered by their limited expressiveness. On the other hand, classical black-box methods, typically relying on Neural Networks (NNs) nowadays, often achieve impressive performance, even at scale, by deriving statistical patterns from data. However, they remain completely oblivious to the underlying physical laws, which may lead to potentially catastrophic failures if decisions for real-world physical systems are based on them. Physically Consistent Neural Networks (PCNNs) were recently developed to address these aforementioned issues, ensuring physical consistency while still leveraging NNs to attain state-of-the-art accuracy. In this work, we scale PCNNs to model building temperature dynamics and propose a thorough comparison with classical gray-box and black-box methods. More precisely, we design three distinct PCNN extensions, thereby exemplifying the modularity and flexibility of the architecture, and formally prove their physical consistency. In the presented case study, PCNNs are shown to achieve state-of-the-art accuracy, even outperforming classical NN-based models despite their constrained structure. Our investigations furthermore provide a clear illustration of NNs achieving seemingly good performance while remaining completely physics-agnostic, which can be misleading in practice. While this performance comes at the cost of computational complexity, PCNNs on the other hand show accuracy improvements of 17-35% compared to all other physically consistent methods, paving the way for scalable physically consistent models with state-of-the-art performance.
Reinforcement Learning (RL) generally suffers from poor sample complexity, mostly due to the need to exhaustively explore the state space to find good policies. On the other hand, we postulate that expert knowledge of the system to control often allows us to design simple rules we expect good policies to follow at all times. In this work, we hence propose a simple yet effective modification of continuous actor-critic RL frameworks to incorporate such prior knowledge in the learned policies and constrain them to regions of the state space that are deemed interesting, thereby significantly accelerating their convergence. Concretely, we saturate the actions chosen by the agent if they do not comply with our intuition and, critically, modify the gradient update step of the policy to ensure the learning process does not suffer from the saturation step. On a room temperature control simulation case study, these modifications allow agents to converge to well-performing policies up to one order of magnitude faster than classical RL agents while retaining good final performance.
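The saturation step described above can be sketched as a clipping of the agent's action to bounds derived from an expert rule. The rule, the comfort thresholds, and the function names below are hypothetical and chosen only to match the room temperature setting. The accompanying change to the policy gradient step is not detailed in the abstract and is therefore not sketched here.

```python
def rule_bounds(room_temp, comfort_low=21.0, comfort_high=24.0):
    """Hypothetical expert rule mapping a room temperature to allowed heating power.

    Heating power is normalised to [0, 1]. Too cold: heat with at least half
    power. Too warm: do not heat. Otherwise: any power is allowed.
    """
    if room_temp < comfort_low:
        return 0.5, 1.0
    if room_temp > comfort_high:
        return 0.0, 0.0
    return 0.0, 1.0


def saturate(action, room_temp):
    """Clip the agent's proposed action to the bounds allowed by the rule."""
    low, high = rule_bounds(room_temp)
    return min(max(action, low), high)
```

For instance, an agent proposing 20% heating power in a 19 degree room would be overridden to 50% under this rule, while any proposal inside the comfort band passes through unchanged as long as it lies in [0, 1].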
Just as vision plays a fundamental role in guiding adaptive locomotion in humans, computer vision may bring substantial improvements to the control strategy of a walking assistive technology by enabling environment-based assistance modulation. In this work, we developed a hip exosuit controller able to distinguish among three different walking terrains through the use of an RGB camera and to adapt the assistance accordingly. The system was tested with seven healthy participants walking along an overground path comprising staircases and level ground. Subjects performed the task with the exosuit disabled (Exo Off), with a constant assistance profile (Vision Off), and with assistance modulation (Vision On). Our results showed that the controller was able to promptly classify the path in front of the user in real time, with an overall accuracy per class above 85%, and to perform assistance modulation accordingly. Evaluation of the effects on the user showed that Vision On outperformed the other two conditions: we obtained significantly higher metabolic savings than Exo Off, with a peak of about -20% when climbing up the staircase and about -16% over the whole path, and than Vision Off when ascending or descending stairs. Such advancements may represent a step forward for the exploitation of lightweight walking assistive technologies in real-life scenarios.
Despite the immense success of neural networks in modeling system dynamics from data, they often remain physics-agnostic black boxes. In the particular case of physical systems, they might consequently make physically inconsistent predictions, which makes them unreliable in practice. In this paper, we leverage the framework of Irreversible port-Hamiltonian Systems (IPHS), which can describe most multi-physics systems, and rely on Neural Ordinary Differential Equations (NODEs) to learn their parameters from data. Since IPHS models are consistent with the first and second principles of thermodynamics by design, so are the proposed Physically Consistent NODEs (PC-NODEs). Furthermore, the NODE training procedure allows us to seamlessly incorporate prior knowledge of the system properties in the learned dynamics. We demonstrate the effectiveness of the proposed method by learning the thermodynamics of a building from real-world measurements and the dynamics of a simulated gas-piston system. Thanks to the modularity and flexibility of the IPHS framework, PC-NODEs can be extended to learn physically consistent models of multi-physics distributed systems.
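Datasets typically contain inaccuracies due to human error and societal biases, and these inaccuracies can affect the outcomes of models trained on such datasets. We present a technique for certifying whether linear regression models are robust to label bias in the training dataset, i.e., whether bounded perturbations to the labels of the training dataset can result in a model that changes the prediction for a test point. We show how to solve this problem exactly for individual test points, and we provide an approximate but more scalable method that does not require advance knowledge of the test point. We extensively evaluate both techniques and find that linear models, both regression-based and classification-based, often display high levels of robustness to bias. However, we also uncover gaps in this robustness, such as high levels of non-robustness for certain bias assumptions on some datasets. Overall, our approach can serve as a guide for when to trust, or question, a model's output.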
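For ordinary least squares, the prediction at a fixed test point is a linear function of the training labels, which is what makes an exact per-point check possible. The sketch below illustrates this for one-dimensional regression with an intercept, under an assumed perturbation model in which every label may shift by at most `eps`. The paper's actual bias model and algorithm may differ, and all function names here are hypothetical.

```python
def prediction_weights(xs, x0):
    """Weights w such that the OLS prediction at x0 equals sum(w[i] * y[i]).

    Closed form for simple linear regression with an intercept:
    w[i] = 1/n + (x0 - mean_x) * (xs[i] - mean_x) / Sxx.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    sxx = sum((x - mean_x) * (x - mean_x) for x in xs)
    return [1.0 / n + (x0 - mean_x) * (x - mean_x) / sxx for x in xs]


def prediction_range(xs, ys, x0, eps):
    """Exact range of the prediction at x0 when each label moves by at most eps."""
    weights = prediction_weights(xs, x0)
    center = sum(w * y for w, y in zip(weights, ys))
    # The worst case pushes every label in the direction of the sign of its weight.
    radius = eps * sum(abs(w) for w in weights)
    return center - radius, center + radius


def is_robust(xs, ys, x0, eps, tolerance):
    """True if no allowed label perturbation moves the prediction by more than tolerance."""
    low, high = prediction_range(xs, ys, x0, eps)
    return (high - low) / 2.0 <= tolerance
```

Test points far from the mean of the training inputs get larger absolute weights, so the same label noise can move their predictions further.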
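This letter reports the design, construction, and experimental validation of a novel hand-held robot for in-office laser surgery of the vocal folds. In-office endoscopic laser surgery is an emerging trend in laryngology: it promises to deliver the same patient outcomes as traditional surgical treatment (i.e., in the operating room) at a fraction of the cost. Unfortunately, office procedures can be challenging to perform. The optical fibers used for laser delivery can only emit light forward in a line-of-sight fashion, which severely limits anatomical access. The robot we present in this letter aims to overcome these challenges. The end effector of the robot is a steerable laser fiber, created by combining a thin optical fiber (0.225 mm) with a tendon-actuated nickel-titanium notched sheath that provides bending. The device can be used seamlessly with most commercially available endoscopes, as it is small enough (1.1 mm) to pass through a working channel. To control the fiber, we propose a compact actuation unit that can be mounted on top of the endoscope handle, so that during a procedure the operating physician can operate both the endoscope and the steerable fiber with a single hand. We report simulation and phantom experiments showing that the proposed device substantially enhances surgical access compared to current clinical fibers.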